Ching-Hua Yu (cyu17)
Summary: Because np.linalg.solve(A, b) requires A to be a square matrix, I first used scipy.sparse.linalg.lsqr(A, b)[0] to solve Av = b. Although the resulting image looks similar, the approximate solution introduces small errors. In fact, the conditions are redundant: to recover the image exactly we only need as many conditions as there are pixel variables. I therefore revised the conditions so that A is square and solved for v with zero error. In addition, using scipy.sparse.csr_matrix together with scipy.sparse.linalg.spsolve(A, b) improves efficiency.
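The idea above can be sketched as follows. This is a hypothetical reconstruction with exactly one condition per pixel (an anchor pixel, horizontal gradients along the first row, and vertical gradients everywhere else); the report's actual choice of conditions may differ, but any such square, consistent system recovers the image exactly.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

def toy_reconstruct(s):
    """Rebuild s exactly from its gradients plus one anchor value,
    using a square sparse system solved by spsolve."""
    h, w = s.shape
    idx = lambda y, x: y * w + x
    rows, cols, vals, b = [], [], [], []
    e = 0
    # anchor pixel: v[0,0] = s[0,0]
    rows.append(e); cols.append(idx(0, 0)); vals.append(1.0)
    b.append(s[0, 0]); e += 1
    # horizontal gradients along the first row
    for x in range(1, w):
        rows += [e, e]; cols += [idx(0, x), idx(0, x - 1)]; vals += [1.0, -1.0]
        b.append(s[0, x] - s[0, x - 1]); e += 1
    # vertical gradients for every remaining pixel
    for y in range(1, h):
        for x in range(w):
            rows += [e, e]; cols += [idx(y, x), idx(y - 1, x)]; vals += [1.0, -1.0]
            b.append(s[y, x] - s[y - 1, x]); e += 1
    A = sp.csr_matrix((vals, (rows, cols)), shape=(h * w, h * w))
    return spsolve(A, np.array(b)).reshape(h, w)
```

The count checks out: 1 + (w-1) + (h-1)*w = h*w conditions for h*w variables, so the solution is unique and exact up to floating-point precision.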
Summary: Poisson blending extends the background color from the boundary into the foreground region and keeps the transition smooth, though it does not preserve much of the background texture. Nevertheless, for a relatively smooth background the result is quite good, as shown above.
Some notes: 1) We use cv2.resize in testing since full-size processing takes time. Note that dim in cv2.resize and the bottom_center coordinates are in (width, height) order, the opposite of the image matrix indexing. 2) We could think more about reducing the condition number of A while keeping a comparable approximation quality. 3) For pixels on any boundary, I finally set the condition to the background value, for the following reason. align_source in utils.py is imperfect in that it cuts exactly at the boundary according to the mask. Consequently, when we compute, e.g., b[e] = s[ys[i],xs[i]] - s[ys[i]+1,xs[i]] + t[ys[i]+1,xs[i]], the term s[ys[i]+1,xs[i]] becomes 0 on the boundary, so b[e] can exceed 1. Alternatively, the boundary condition could simply be set to b[e] = t[ys[i]+1,xs[i]] (which forces v[ys[i],xs[i]] = t[ys[i]+1,xs[i]]).
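The boundary handling above can be sketched for a single channel as follows. This is a minimal, hypothetical version of the Poisson solve (one equation per masked pixel, accumulating one gradient term per neighbor): when a neighbor lies outside the mask, its target value t[q] moves to the right-hand side as a Dirichlet boundary condition, exactly the b[e] = s[p] - s[q] + t[q] form discussed above.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

def poisson_blend_gray(s, t, mask):
    """Single-channel Poisson blend (sketch). For each pixel p in the mask,
    each neighbour q contributes v[p] - v[q] = s[p] - s[q] if q is masked,
    and v[p] = s[p] - s[q] + t[q] if q is outside the mask."""
    h, w = t.shape
    ys, xs = np.nonzero(mask)
    index = -np.ones((h, w), dtype=int)
    index[ys, xs] = np.arange(len(ys))      # variable index for masked pixels
    n = len(ys)
    rows, cols, vals = [], [], []
    b = np.zeros(n)
    for k, (y, x) in enumerate(zip(ys, xs)):
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            qy, qx = y + dy, x + dx
            if not (0 <= qy < h and 0 <= qx < w):
                continue
            rows.append(k); cols.append(k); vals.append(1.0)
            b[k] += s[y, x] - s[qy, qx]     # source gradient
            if index[qy, qx] >= 0:          # neighbour inside the mask
                rows.append(k); cols.append(index[qy, qx]); vals.append(-1.0)
            else:                           # boundary: Dirichlet target value
                b[k] += t[qy, qx]
    A = sp.csr_matrix((vals, (rows, cols)), shape=(n, n))
    v = t.astype(float)
    v[ys, xs] = spsolve(A, b)
    return v
```

Duplicate (k, k) entries are summed by the sparse constructor, so the diagonal automatically becomes the neighbor count, giving the usual discrete Laplacian system.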
Next, we see a failure case and another example.
The above case is less than ideal in that the foreground cut doesn't have boundary pixels that harmonize with the surrounding background. We can see that the penguin's black head has been unevenly whitened. This is somewhat improved by the gradient-domain processing later.
As another example, I tried to replace one of two birds facing each other with a third one.
(The two pictures are attributed to Microsoft Corporation (desktop theme of Colorful Birds) and to http://listenbirds.blogspot.com, respectively.)
Summary: As the result shows, when we take the gradient from whichever of the foreground and background image has the larger absolute value, many more background features are retained in the patch, while the important features of the foreground image are still preserved.
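The mixed-gradient rule above reduces to an elementwise selection. A minimal sketch (the function name is mine, not from the assignment code):

```python
import numpy as np

def mixed_gradient(gs, gt):
    """Elementwise, keep whichever of the source gradient gs or the
    target (background) gradient gt has the larger magnitude."""
    return np.where(np.abs(gs) >= np.abs(gt), gs, gt)
```

These mixed values then replace the pure source gradients on the right-hand side b of the Poisson system; the system itself is unchanged.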
As another example, I put Mickey on a lawn, as below. (The lawn image is from https://trugreenmidsouth.com.)
Summary: I first tried to use the same number of conditions as variables, as in toy_reconstruct. Even after balancing the number of row-based conditions against the number of column-based ones, the recovered image loses smoothness in places (e.g., if most conditions are row-based, the output loses some vertical smoothness in parts). (It would be worth revisiting the minimum number of conditions, or understanding whether about 2mulcol conditions are necessary.)
For now, I use 2mulcol-1 conditions and solve the system approximately. Essentially I do
where s(x,y) is the gray-scale value of the input (i.e., the mean of the 3 color channels), and gradx (resp. grady) is the gradient among the 3 channels with the largest magnitude (i.e., absolute value). There are alternatives here, e.g., taking the mean of the channel gradients, or summing only the gradients sharing the majority sign, etc.
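One hedged reading of "the gradient among the 3 channels with the largest magnitude" is the following per-pixel channel selection (the function name is mine; the report's implementation may use a different gradient stencil):

```python
import numpy as np

def strongest_channel_gradient(img):
    """For each pixel pair, compute the x- and y-gradients of all 3
    channels and keep, per location, the one with largest magnitude."""
    gx = np.diff(img, axis=1)   # horizontal gradients, shape (h, w-1, 3)
    gy = np.diff(img, axis=0)   # vertical gradients, shape (h-1, w, 3)

    def pick(g):
        idx = np.abs(g).argmax(axis=-1)[..., None]   # winning channel
        return np.take_along_axis(g, idx, axis=-1)[..., 0]

    return pick(gx), pick(gy)
```

Note that the sign of the winning channel is kept, which is what lets strong opposing color edges survive the conversion to gray.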
Hints: 1) Remember to normalize the input by /255.0, or at least convert it to a float type; otherwise there will be overflows, resulting in an incorrect output. 2) For testing, it's better to use a small region of the image or cv2.resize, because even a medium-size image already takes a while to process.
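The overflow in hint 1 is easy to reproduce: uint8 arithmetic wraps around modulo 256, so any negative gradient silently becomes a large positive value unless the image is converted to float first.

```python
import numpy as np

s = np.array([[100]], dtype=np.uint8)
t = np.array([[200]], dtype=np.uint8)

wrong = s - t                                 # uint8 wraps: 100 - 200 -> 156
right = s.astype(float) / 255.0 - t.astype(float) / 255.0   # -> about -0.392
```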
Another example is shown below.
Summary: Beyond the theory in, e.g., http://graphics.cs.cmu.edu/courses/15-463/2005_fall/www/Lectures/Pyramids.pdf, there are two challenges in practice. First, bringing the background texture into the foreground while retaining the high-frequency components of the foreground. Second, eliminating the boundary of the mask: when the image is light-colored, this is not easily achieved just by applying a Gaussian filter to the mask. I ended up using both the Gaussian filter and fine-tuning of the alpha matte in the pyramid-based blending to achieve this.
Other note: cv2.pyrUp doesn't always produce the same size as the corresponding level of cv2.pyrDown, so it's better to use cv2.resize instead.
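The overall pyramid blend can be sketched as below. This is a scipy-based stand-in for illustration (the report itself uses cv2.pyrDown/pyrUp with cv2.resize to fix the size mismatches); the key point from the summary is that the mask is Gaussian-smoothed at every level so the seam fades gradually.

```python
import numpy as np
from scipy import ndimage

def pyramid_blend(a, b, mask, levels=3, sigma=2.0):
    """Laplacian-pyramid blend of a (foreground) and b (background)
    under mask, smoothing the mask with a Gaussian at each level."""
    def down(im):
        return ndimage.zoom(im, 0.5, order=1)

    def up(im, shape):  # resize back to an exact shape (cv2.resize's role)
        return ndimage.zoom(im, np.array(shape) / np.array(im.shape), order=1)

    la, lb, gm = [], [], []
    m = mask.astype(float)
    for _ in range(levels):
        a2, b2 = down(a), down(b)
        la.append(a - up(a2, a.shape))            # Laplacian level of a
        lb.append(b - up(b2, b.shape))            # Laplacian level of b
        gm.append(ndimage.gaussian_filter(m, sigma))  # softened mask
        a, b, m = a2, b2, down(m)
    out = m * a + (1.0 - m) * b                   # blend coarsest residual
    for lai, lbi, gmi in zip(la[::-1], lb[::-1], gm[::-1]):
        out = up(out, lai.shape) + gmi * lai + (1.0 - gmi) * lbi
    return out
```

Tuning sigma per level (or reshaping the alpha matte directly, as the summary describes) trades off how much background texture bleeds into the foreground against how visible the mask boundary remains.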
The implementation follows the concept of non-photorealistic rendering (e.g., https://grail.cs.washington.edu/projects/gradientshop/demos/gs_paper_TOG_2009.pdf), but uses a simplified variant. Essentially, we want to enhance the gradients of clear edges while suppressing the gradients of small spots. Here I use the Laplacian filter to characterize the edge features of the image, then sum the local edge responses over a region given by a fineness factor (5) (by convolving with a simple np.ones mask). Next I raise the result to a suppression power (1.5), normalize it, and use it to estimate the saliency of the image.
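The saliency estimate just described can be sketched as follows (a hypothetical single-channel version; the ndimage.uniform_filter scaled by the window area stands in for convolution with an np.ones mask):

```python
import numpy as np
from scipy import ndimage

def saliency_map(gray, fineness=5, suppression=1.5):
    """Edge saliency: Laplacian magnitude, summed over a fineness x
    fineness window, raised to the suppression power, normalized."""
    edges = np.abs(ndimage.laplace(gray))                 # local edge strength
    # box-sum over the neighbourhood (mean filter times window area)
    pooled = ndimage.uniform_filter(edges, size=fineness) * fineness ** 2
    sal = pooled ** suppression                           # damp weak responses
    return sal / sal.max() if sal.max() > 0 else sal
```

The suppression power > 1 is what makes isolated small spots (low pooled response) fall off faster than coherent edges, so the final map emphasizes clear contours.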